An Improved Encoder-Decoder Framework for Food Energy Estimation
Dietary assessment is essential to maintaining a healthy lifestyle. Automatic
image-based dietary assessment is a growing field of research due to the
increasing prevalence of image capturing devices (e.g. mobile phones). In this
work, we estimate food energy from a single monocular image, a difficult task
because the energy information present in an image is limited and hard to
extract. To do so, we employ an improved encoder-decoder framework for energy
estimation: the encoder transforms the image into a representation in which
food energy information is easier to extract, and the decoder then extracts
the energy information from that representation. To implement our method, we compile
a high-quality food image dataset verified by registered dietitians containing
eating scene images, food-item segmentation masks, and ground truth calorie
values. Our method improves upon previous caloric estimation methods by over
10% in MAPE and 30 kCal in MAE.
Comment: Accepted for Madima'23 in ACM Multimedi
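For reference, the two metrics named above, MAE (mean absolute error) and MAPE (mean absolute percentage error), can be computed as in the minimal sketch below. The calorie values and function names are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in the same unit as the inputs (here kCal)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute error relative to the ground truth, in percent.

    Assumes every ground-truth value is non-zero.
    """
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ground-truth and predicted energy values (kCal) for three images.
true_kcal = [200.0, 400.0, 500.0]
pred_kcal = [220.0, 380.0, 450.0]

mae = mean_absolute_error(true_kcal, pred_kcal)
mape = mean_absolute_percentage_error(true_kcal, pred_kcal)
print(f"MAE = {mae:.1f} kCal, MAPE = {mape:.2f}%")
```

MAE reports the error in absolute kCal, while MAPE normalizes each error by the true value, which is why the abstract reports the two improvements in different units.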
Personalized Food Image Classification: Benchmark Datasets and New Baseline
Food image classification is a fundamental step of image-based dietary
assessment, enabling automated nutrient analysis from food images. Many current
methods employ deep neural networks to train on generic food image datasets
that do not reflect the dynamism of real-life food consumption patterns, in
which food images appear sequentially over time, reflecting the progression of
what an individual consumes. Personalized food classification aims to address
this problem by training a deep neural network using food images that reflect
the consumption pattern of each individual. However, this problem is
under-explored and there is a lack of benchmark datasets with individualized
food consumption patterns due to the difficulty in data collection. In this
work, we first introduce two personalized benchmark datasets: Food101-Personal,
which is created based on surveys of daily dietary patterns from participants
in the real world, and VFNPersonal, which is developed based on a dietary
study. In addition, we propose a new framework for
personalized food image classification by leveraging self-supervised learning
and temporal image feature information. Our method is evaluated on both
benchmark datasets and shows improved performance compared to existing works.
The dataset has been made available at:
https://skynet.ecn.purdue.edu/~pan161/dataset_personal.html
Comment: Accepted by IEEE Asilomar conference (2023
Muti-Stage Hierarchical Food Classification
Food image classification serves as a fundamental and critical step in
image-based dietary assessment, facilitating nutrient intake analysis from
captured food images. However, existing work in food classification
predominantly focuses on predicting 'food types', which do not contain direct
nutritional composition information. This limitation arises from the inherent
discrepancies in nutrition databases, which are tasked with associating each
'food item' with its respective information. Therefore, in this work we aim to
classify food items to align with the nutrition database. To this end, we first
introduce the VFN-nutrient dataset by annotating each food image in VFN with a food
item that includes nutritional composition information. Such annotation of food
items, being more discriminative than food types, creates a hierarchical
structure within the dataset. However, since the food item annotations are
solely based on nutritional composition information, they do not always show
visual relations with each other, which poses significant challenges when
applying deep learning-based techniques for classification. To address this
issue, we then propose a multi-stage hierarchical framework for food item
classification by iteratively clustering and merging food items during the
training process, which allows the deep model to extract image features that
are discriminative across labels. Our method is evaluated on the VFN-nutrient
dataset and achieves promising results compared with existing work in terms of
both food type and food item classification.
Comment: accepted for ACM MM 2023 Madim
Multi-Position Identification of Joint Parameters in Ball Screw Feed System Based on Response Coupling
Existing parameter identification methods do not consider the torsional characteristics of a ball screw and the worktable position simultaneously. Therefore, this paper proposes a multi-position identification method based on receptance coupling. First, a mathematical model of the feed drive system is established using improved receptance coupling; this model considers both axial and torsional vibration of the ball screw. Second, the identification equation is established by minimizing the error in the modal parameters across multiple worktable positions, and a differential evolution algorithm is used to calculate the stiffness and damping of the joint. Finally, a self-developed ball screw feed drive system is used for experimental study. The maximum error of the first four natural frequencies predicted from the multi-position identification results is 2.95%, and the multi-position method is more robust than the common method of identification at a single position. The experimental study shows that the proposed method is accurate and necessary.
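The idea of identifying a joint parameter by minimizing the modal error across several worktable positions with differential evolution can be illustrated with the toy sketch below. It is not the paper's receptance coupling model: it assumes a hypothetical single-degree-of-freedom system in which the screw's axial stiffness (EA / position) acts in series with an unknown joint stiffness, it omits damping and torsion, and every numeric value and function name is an assumption made for illustration.

```python
import math
import random


def natural_frequency(position_m, joint_stiffness, mass_kg=50.0, ea_newton=1.7e8):
    """Toy model (assumption): screw stiffness EA/x in series with the joint stiffness."""
    screw_stiffness = ea_newton / position_m
    k_eq = 1.0 / (1.0 / screw_stiffness + 1.0 / joint_stiffness)
    return math.sqrt(k_eq / mass_kg) / (2.0 * math.pi)


def identification_error(joint_stiffness, positions, measured):
    """Sum of squared relative frequency errors over all worktable positions."""
    return sum(
        ((natural_frequency(x, joint_stiffness) - f) / f) ** 2
        for x, f in zip(positions, measured)
    )


def differential_evolution_1d(cost, lower, upper, pop_size=20,
                              generations=200, f_weight=0.7, seed=0):
    """Minimal DE/rand/1 for one parameter, with greedy selection."""
    rng = random.Random(seed)
    population = [rng.uniform(lower, upper) for _ in range(pop_size)]
    costs = [cost(p) for p in population]
    for _ in range(generations):
        for i in range(pop_size):
            others = [p for j, p in enumerate(population) if j != i]
            a, b, c = rng.sample(others, 3)
            # Mutation, clipped to the search bounds.
            trial = min(max(a + f_weight * (b - c), lower), upper)
            trial_cost = cost(trial)
            # Keep the trial only if it is at least as good as the current member.
            if trial_cost <= costs[i]:
                population[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return population[best]


# Synthetic "measurements" generated from a known (hypothetical) joint stiffness.
K_TRUE = 2.0e8  # N/m
positions = [0.2, 0.4, 0.6, 0.8]  # worktable positions in metres
measured = [natural_frequency(x, K_TRUE) for x in positions]

k_est = differential_evolution_1d(
    lambda k: identification_error(k, positions, measured), 1.0e7, 1.0e9
)
print(f"identified joint stiffness: {k_est:.3e} N/m")
```

Fitting all positions in one cost function is what distinguishes this from single-position identification: the estimate has to explain how the natural frequency changes as the worktable moves, not just its value at one point.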
Lossy Image Compression with Quantized Hierarchical VAEs
Recent research has shown a strong theoretical connection between variational
autoencoders (VAEs) and the rate-distortion theory. Motivated by this, we
consider the problem of lossy image compression from the perspective of
generative modeling. Starting with ResNet VAEs, which were originally designed
for data (image) distribution modeling, we redesign their latent variable model
using a quantization-aware posterior and prior, enabling easy quantization and
entropy coding at test time. Along with an improved neural network architecture,
we present a powerful and efficient model that outperforms previous methods on
natural image lossy compression. Our model compresses images in a
coarse-to-fine fashion and supports parallel encoding and decoding, leading to
fast execution on GPUs. Code is available at
https://github.com/duanzhiihao/lossy-vae.
Comment: WACV 2023 Best Algorithms Paper Award, revised versio